This document is a short analysis of the relationship between apartment sale price and ground living area. It is associated with this publication. This work is hosted in this GitHub repository.
See the code for a list of necessary libraries.
library(tidyverse)
library(rmarkdown) # You need this library to run this template.
library(epuRate) # Install with devtools: install_github("holtzy/epuRate", force=TRUE)
library(plotly) # Turn your ggplot2 interactive
library(hrbrthemes) # For good looking plots
library(DT) # To show tables

Data often need to be prepared. Here I just select 100 random samples.
# Load dataset from github
data <- read.table(
  "https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/2_TwoNum.csv",
  header = TRUE, sep = ","
) %>%
  dplyr::select(GrLivArea, SalePrice)
# Keep a few lines
data <- data %>% sample_n(100)

Here is a description of the relationship between prices and ground living area:
# Plot
p <- data %>%
  mutate(text = paste("Apartment Number: ", seq_len(nrow(data)), "\nLocation: New York\nAny info you need..", sep = "")) %>%
  ggplot(aes(x = GrLivArea, y = SalePrice / 1000, text = text)) +
  geom_point(color = "#69b3a2", alpha = 0.8) +
  ggtitle("Ground living area partially explains sale price of apartments") +
  theme_ipsum() +
  theme(
    plot.title = element_text(size = 12)
  ) +
  ylab("Sale price (k$)") +
  xlab("Ground living area")
# Turn it interactive
ggplotly(p, tooltip="text")

In case you're interested in a particular data point, here is the complete input dataset:
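A table like this can be rendered with the DT package loaded earlier. The following is a minimal sketch, assuming `datatable()` with illustrative options (a filter row on top, five rows per page, horizontal scrolling); these options are an assumption, not necessarily the ones used for this report. `example_data` is a hypothetical stand-in with made-up values so the snippet runs on its own.

```r
library(DT)

# Hypothetical stand-in rows (made-up values) so the snippet runs on its own.
# In this document, pass the sampled `data` object instead.
example_data <- data.frame(
  GrLivArea = c(1200, 1710, 2198),
  SalePrice = c(143000, 208500, 250000)
)

# Build an interactive table: no row names, per-column filters on top,
# five rows per page, horizontal scrolling for narrow screens.
tbl <- datatable(
  example_data,
  rownames = FALSE,
  filter = "top",
  options = list(pageLength = 5, scrollX = TRUE)
)

tbl
```

Replacing `example_data` with `data` shows the 100 sampled rows, and the filter row makes it possible to narrow the table to a range of living areas or prices.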
A work by Yan Holtz